System and method for dynamic equalization of audio data

ABSTRACT

A system for processing audio data is disclosed that includes a plurality of gain adjustment devices, each gain adjustment device having an associated audio input frequency band. A plurality of control signal processing systems are configured to receive audio input data for one of the associated audio input frequency bands and to generate a gain adjustment device control signal. The gain adjustment device control signal is configured to decrease a gain setting of an associated gain adjustment device for a predetermined period of time as a function of a transient in the associated audio input frequency band.

RELATED APPLICATIONS

The present application claims priority to and benefit of U.S. Provisional Patent Application No. 62/092,603, filed on Dec. 16, 2014, U.S. Provisional Patent Application No. 62/133,167, filed on Mar. 13, 2015, U.S. Provisional Patent Application No. 62/156,061, filed on May 1, 2015, and U.S. Provisional Patent Application No. 62/156,065, filed on May 1, 2015, each of which is hereby incorporated by reference for all purposes as if set forth herein in its entirety.

TECHNICAL FIELD

The present disclosure relates generally to audio data processing, and more specifically to a system and method for dynamic equalization of audio data that reduces audio energy processing consumption.

BACKGROUND OF THE INVENTION

Equalization of audio data is used to control the relative gain of frequency components of the audio data, such as to boost low frequency components, middle frequency components or high frequency components.

SUMMARY OF THE INVENTION

A system for processing audio data is disclosed that includes a plurality of gain adjustment devices, each gain adjustment device having an associated audio input frequency band. A plurality of control signal processing systems are configured to receive audio input data for one of the associated audio input frequency bands and to generate a gain adjustment device control signal. The gain adjustment device control signal is configured to decrease a gain setting of an associated gain adjustment device for a predetermined period of time as a function of a transient in the associated audio input frequency band.

Other systems, methods, features, and advantages of the present disclosure will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the present disclosure, and be protected by the accompanying claims.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

Aspects of the disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views, and in which:

FIG. 1 is a diagram of a system for dynamic equalization of audio data, in accordance with an exemplary embodiment of the present disclosure;

FIG. 2 is a diagram of a system for controlling dynamic equalization of audio data, in accordance with an exemplary embodiment of the present disclosure;

FIG. 3 is a diagram of an algorithm for dynamic equalization of audio data, in accordance with an exemplary embodiment of the present disclosure;

FIG. 4 is a diagram of a system for parametric stereo processing, in accordance with an exemplary embodiment of the present disclosure; and

FIG. 5 is a diagram of an algorithm for parametric stereo processing, in accordance with an exemplary embodiment of the present disclosure.

DETAILED DESCRIPTION OF THE INVENTION

In the description that follows, like parts are marked throughout the specification and drawings with the same reference numerals. The drawing figures might not be to scale and certain components can be shown in generalized or schematic form and identified by commercial designations in the interest of clarity and conciseness.

Audio data can include events that are louder than other events, such as gunshots, cymbal crashes, drum beats and so forth. When these events occur, they mask audio data that is 13 dB lower in gain for a period of time (typically around 200 milliseconds), such as audio data that has the same frequency components as the frequency components of the event. This masking occurs as a result of the psychoacoustic processes related to hearing. However, even though the masked audio signals cannot be perceived, the nerve cells in the organ of Corti are still receiving the masked audio signals, and are using energy to process them. This additional energy use results in a loss of hearing sensitivity. As such, the audio processing system that amplifies such signals is not only wasting energy on amplification of signals that are not perceived by the listener, it is also wasting that energy to create an inferior listening experience.

By detecting such transient events and dynamically equalizing the audio data to reduce the audio signals that will be masked, the amount of energy consumed by the audio processing system can be reduced, which can result in longer battery life. In addition, the effect of such masked audio signals on the nerves in the organ of Corti can be reduced or eliminated, which results in an improved audio experience for the listener.

FIG. 1 is a diagram of a system 100 for dynamic equalization of audio data, in accordance with an exemplary embodiment of the present disclosure. System 100 can be implemented in hardware or a suitable combination of hardware and software.

As used herein, “hardware” can include a combination of discrete components, an integrated circuit, an application-specific integrated circuit, a field programmable gate array, or other suitable hardware. As used herein, “software” can include one or more objects, agents, threads, lines of code, subroutines, separate software applications, two or more lines of code or other suitable software structures operating in two or more software applications, on one or more processors (where a processor includes a microcomputer or other suitable controller, memory devices, input-output devices, displays, data input devices such as a keyboard or a mouse, peripherals such as printers and speakers, associated drivers, control cards, power sources, network devices, docking station devices, or other suitable devices operating under control of software systems in conjunction with the processor or other devices), or other suitable software structures. In one exemplary embodiment, software can include one or more lines of code or other suitable software structures operating in a general purpose software application, such as an operating system, and one or more lines of code or other suitable software structures operating in a specific purpose software application. As used herein, the term “couple” and its cognate terms, such as “couples” and “coupled,” can include a physical connection (such as a copper conductor), a virtual connection (such as through randomly assigned memory locations of a data memory device), a logical connection (such as through logical gates of a semiconducting device), other suitable connections, or a suitable combination of such connections.

System 100 includes crossover 102, which receives audio data and processes the audio data to generate separate frequency bands of audio data. In one exemplary embodiment, crossover 102 can generate a first band having a frequency range of 0-50 Hz, a second band having a frequency range of 50-500 Hz, a third band having a frequency range of 500-4500 Hz and a fourth band having a frequency range of 4500 Hz and above, or other suitable numbers of bands and associated frequency ranges can also or alternatively be used. The input to crossover 102 can be an unprocessed audio signal, a normalized audio signal or other suitable audio signals.
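The band split described above can be sketched as follows. This is a minimal illustration only: the disclosure does not specify the filter type used in crossover 102, so the first-order filters, the subtraction-based split, and the names `one_pole_lowpass` and `split_bands` are assumptions chosen for brevity. Only the band edges of 50 Hz, 500 Hz and 4500 Hz come from the text.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """First-order low-pass filter (assumed; a practical crossover would likely be steeper)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

def split_bands(samples, sample_rate, edges_hz=(50.0, 500.0, 4500.0)):
    """Split samples into len(edges_hz) + 1 bands, lowest band first.

    Bands are peeled off from the top down: at each edge, what the low-pass
    rejects becomes the band above that edge, and what it passes is split
    further. The bands therefore sum back to the input.
    """
    bands = []
    remainder = list(samples)
    for edge in sorted(edges_hz, reverse=True):
        low = one_pole_lowpass(remainder, edge, sample_rate)
        bands.append([r - l for r, l in zip(remainder, low)])
        remainder = low
    bands.append(remainder)
    bands.reverse()
    return bands
```

Because each band is formed by subtraction, recombining the bands at unity gain reproduces the input, which makes the effect of any later per-band gain change easy to reason about.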

The outputs of crossover 102 are further filtered using associated filters, such as low pass filter 104, low mid pass filter 106, mid pass filter 108 and high pass filter 110, or other suitable filters. In addition, the high frequency band can be further processed to add harmonic components, such as to compensate for lossy compression processing of the audio data that can result in audio data having a narrow image width and sparse frequency components. In one exemplary embodiment, the harmonic components can be added using clipping circuit 112, which generates harmonic components by clipping the high frequency components of the audio data. High pass filter 114 is used to remove lower frequency harmonic components, and scaler 116 is used to control the magnitude of the harmonic processed audio that is added to the unprocessed audio at adder 138. Control of scaler 116 is provided by crossover 118, which can generate a high frequency band output for frequencies above a predetermined level, such as 8000 Hz, and a low frequency band output for frequencies below the predetermined level. The RMS values of the high and low frequency bands are generated by RMS processors 120 and 122, and the RMS values are then converted from linear values to log values by DB20 124 and DB20 126, respectively. The difference between the high and low frequency components is then determined using subtractor 128, and a value from table 130 is used to determine the amount of high frequency harmonic frequency component signal to be added to the unprocessed high frequency audio signal. In one exemplary embodiment, the amount can be set to zero until there is a 6 dB difference between the low and high frequency components, and as the difference increases from 6 dB to 10 dB, the amount of high frequency harmonic frequency component signal that is added to the unprocessed high frequency audio signal can increase from 0 dB to 8 dB. As the difference between the low and high frequency components increases from 10 dB to 15 dB, the amount of high frequency harmonic frequency component signal that is added to the unprocessed high frequency audio signal can increase from 8 dB to 9 dB. Likewise, other suitable amounts of high frequency harmonic frequency component signal can be added to the unprocessed high frequency audio signal. Increasing the amount of high frequency harmonic frequency component signal that is added to the unprocessed high frequency audio signal as a function of the change in the relative content of low and high frequency components of the high frequency band can be used to improve audio quality, because the difference is indicative of a sparse audio signal. The additional harmonic content helps to improve a sparse audio signal by providing additional frequency components that are complementary to the audio data. The high frequency harmonic components are then added to the unprocessed high frequency components by adder 138.
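The RMS, linear-to-log, subtractor and table stages described above can be sketched as follows. The breakpoints (6 dB, 10 dB, 15 dB mapping to 0 dB, 8 dB, 9 dB) are taken from the text; everything else is an assumption made for illustration, including linear interpolation between breakpoints, holding the 9 dB value above a 15 dB difference, taking the difference as low band minus high band, and the function names.

```python
import math

def rms(samples):
    """Root-mean-square value of a block of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def db20(value, floor=1e-12):
    """Linear-to-log conversion, 20*log10; the floor (assumed) avoids log of zero."""
    return 20.0 * math.log10(max(value, floor))

def band_difference_db(low_band, high_band):
    """Difference signal; the low-minus-high sign convention is an assumption."""
    return db20(rms(low_band)) - db20(rms(high_band))

# (difference in dB, harmonic amount in dB) breakpoints from the description.
BREAKPOINTS = [(6.0, 0.0), (10.0, 8.0), (15.0, 9.0)]

def harmonic_amount(difference_db):
    """Table lookup with linear interpolation between breakpoints (assumed)."""
    if difference_db <= BREAKPOINTS[0][0]:
        return 0.0  # amount stays at zero until the 6 dB difference is reached
    for (x0, y0), (x1, y1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if difference_db <= x1:
            return y0 + (y1 - y0) * (difference_db - x0) / (x1 - x0)
    return BREAKPOINTS[-1][1]  # hold the last table value (assumed)
```

For example, under these assumptions a difference of 8 dB falls halfway between the first two breakpoints and yields an amount of 4 dB.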

Equalization of the low pass frequency component is accomplished using scaler 132 under control of an input A, equalization of the mid pass frequency component is accomplished using scaler 136 under control of an input B, and equalization of the high pass frequency component is accomplished using scaler 140 under control of an input C. The outputs of the equalized audio components and the unprocessed low-mid pass output 106 are combined using adder 142 to generate an output.

In operation, system 100 performs dynamic equalization to reduce power loss and also improves audio signal quality. The psychoacoustic masking processes result in a 150 to 200 millisecond loss in perception when a masking input having a crest of 13 dB or more is generated, due to the reaction of kinocilia to such audio inputs. When such transients occur, maintaining or increasing the audio level during the dead zone that follows the transient only serves to increase the power consumed by the audio processing system without increasing audio quality. In addition, while the audio input is not resulting in nerve signals that ultimately reach the listener's brain, processing of that audio energy still requires work to be done by kinocilia in the organ of Corti, and can also increase the amount of additional energy that is required in order to generate a perceptible response. By dynamically equalizing the audio data to reduce the gain during such periods, system 100 helps to reduce both the amount of energy required to process the audio data as well as the amount of energy required by the listener to listen to the audio data. In addition, adding harmonic frequency content to the high frequency audio data when the total audio data content is sparse helps to improve the perceived audio quality, by providing additional frequency components in the sparse audio data that complement the existing frequency components.
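The gain reduction during the period that follows a transient can be sketched as follows. The 200 millisecond default reflects the 150 to 200 millisecond range given above; the reduced gain value of 0.5, the choice to pass the transient itself at full gain, the abrupt (unsmoothed) gain steps, and the name `gain_envelope` are illustrative assumptions and are not taken from the disclosure.

```python
def gain_envelope(transient_flags, sample_rate, hold_ms=200.0, reduced_gain=0.5):
    """Return one gain value per sample for a single frequency band.

    transient_flags holds one boolean per sample, True where a masking
    transient was detected. After each transient, the gain is held at
    reduced_gain for hold_ms milliseconds, then restored to 1.0.
    """
    hold_samples = int(sample_rate * hold_ms / 1000.0)
    gains, remaining = [], 0
    for flag in transient_flags:
        if flag:
            gains.append(1.0)         # assumed: the transient itself is not attenuated
            remaining = hold_samples  # (re)start the hold period
        elif remaining > 0:
            gains.append(reduced_gain)
            remaining -= 1
        else:
            gains.append(1.0)
    return gains
```

Multiplying a band's samples by the returned gains corresponds to the role of one of the scalers under control of its input; a practical implementation would likely ramp the gain rather than step it.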

FIG. 2 is a diagram of a system 200 for controlling dynamic equalization of audio data, in accordance with an exemplary embodiment of the present disclosure. System 200 can be implemented in hardware or a suitable combination of hardware and software.

System 200 includes automatic gain control core 202 and automatic gain control multiplier 204, which are configured to receive an audio signal input and to generate a normalized audio signal output. The normalized audio signal output of AGC multiplier 204 can also be provided to crossover 102 or other suitable systems or components.

Filter 206 can be a band pass filter having a frequency range of 40 to 80 Hz or other suitable filters. The output from filter 206 is processed by RMS processor 208 to generate a signal that represents the RMS value of the output of filter 206. Derivative processor 210 receives the band pass RMS value and generates an output that represents the rate of change of the band pass RMS value. Downward expander 212 is used to prevent dynamic equalization of the associated frequency band when there is no associated transient occurring in the frequency band.

Filter 214 can be a band pass filter having a frequency range of 500 to 4000 Hz or other suitable filters. The output from filter 214 is processed by RMS processor 216 to generate a signal that represents the RMS value of the output of filter 214. Derivative processor 218 receives the band pass RMS value and generates an output that represents the rate of change of the band pass RMS value. Downward expander 220 is used to prevent dynamic equalization of the associated frequency band when there is no associated transient occurring in the frequency band.

Filter 222 can be a high pass filter having a frequency range of 4000 Hz and above or other suitable filters. The output from filter 222 is processed by RMS processor 224 to generate a signal that represents the RMS value of the output of filter 222. Derivative processor 226 receives the high pass RMS value and generates an output that represents the rate of change of the high pass RMS value. Downward expander 228 is used to prevent dynamic equalization of the associated frequency band when there is no associated transient occurring in the frequency band.
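Each of the three control paths above applies the same chain to its filtered band: an RMS value, a derivative of that value, and a downward expander. A minimal sketch of that chain follows. The block-based RMS, the first-difference derivative, the expander law, the treatment of falling levels as zero, and all names and default parameters are assumptions; the disclosure does not specify them.

```python
import math

def block_rms(samples, block_size):
    """RMS value of each consecutive, non-overlapping block of samples."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + block_size]) / block_size)
        for i in range(0, len(samples) - block_size + 1, block_size)
    ]

def derivative(values):
    """First difference, used as a simple rate-of-change estimate."""
    return [0.0] + [b - a for a, b in zip(values, values[1:])]

def downward_expand(values, threshold, ratio=4.0):
    """Pass values at or above threshold; push smaller rises toward zero."""
    out = []
    for v in values:
        if v <= 0.0:
            out.append(0.0)  # flat or falling level: no control signal (assumed)
        elif v >= threshold:
            out.append(v)    # significant rise passes unchanged
        else:
            out.append(threshold * (v / threshold) ** ratio)
    return out

def control_signal(band_samples, block_size, threshold):
    """RMS -> derivative -> downward expander for one filtered band."""
    return downward_expand(derivative(block_rms(band_samples, block_size)), threshold)
```

With a band that is silent for two blocks and then jumps to full scale, the control signal is nonzero only at the block where the level rises, which is the behavior the downward expander is described as enforcing.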

In operation, system 200 generates control inputs for a dynamic equalizer, by detecting transients in frequency bands associated with the dynamic equalization. System 200 thus helps to improve power consumption for an audio data processor, and also helps to improve the perceptual audio quality.

FIG. 3 is a diagram of an algorithm 300 for dynamic equalization of audio data, in accordance with an exemplary embodiment of the present disclosure. Algorithm 300 can be implemented in hardware or a suitable combination of hardware and software. Although algorithm 300 is shown as a flow chart, other suitable programming paradigms such as state diagrams and object-oriented programming can also or alternatively be used.

Algorithm 300 begins at 302, where audio data is received and processed, such as to generate a normalized audio signal by using a first adaptive gain control processor that is used to remove a DC signal component and a second adaptive gain control processor that receives the output of the first adaptive gain control processor and the input audio, or in other suitable manners. The algorithm then proceeds to 304, where the audio data is filtered to generate different bands of audio data. The algorithm then proceeds in parallel to 306 and 314.

At 306, the high and low frequency components of one or more of the bands of audio data are analyzed, such as to determine whether there is a greater RMS value of one component compared to the other. The algorithm then proceeds to 308, where it is determined whether the difference is indicative of an audio signal that is sparse or that otherwise would benefit from additional harmonic content, such as by exceeding a predetermined level. If the difference does not exceed the level, the algorithm proceeds to 318 where the unprocessed signal is dynamically equalized, otherwise the algorithm proceeds to 310, where harmonic content is generated. In one exemplary embodiment, harmonic content can be generated by clipping the audio signal and then filtering the clipped signal to remove predetermined harmonic frequency components, or in other suitable manners. The algorithm then proceeds to 312, where the harmonic content is added to the unprocessed audio signal and the combined signal is processed at 318.
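Steps 310 and 312 above (clip, filter the clipped signal, then add the result to the unprocessed signal) can be sketched as follows. The hard-clipping law, the first-order high-pass filter, the default clip limit, cutoff and amount, and the function names are all assumptions for illustration; the disclosure does not give these values.

```python
import math

def hard_clip(samples, limit):
    """Clip each sample to the range [-limit, +limit], which generates harmonics."""
    return [max(-limit, min(limit, x)) for x in samples]

def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """First-order high-pass (assumed): the input minus its low-passed version."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(x - state)
    return out

def add_harmonics(samples, sample_rate, clip_limit=0.5, cutoff_hz=8000.0, amount=0.25):
    """Clip, remove lower frequency content, scale, and add to the unprocessed signal.

    clip_limit, cutoff_hz and amount are hypothetical values chosen for the sketch;
    amount is a linear scale factor here.
    """
    clipped = hard_clip(samples, clip_limit)
    harmonics = one_pole_highpass(clipped, cutoff_hz, sample_rate)
    return [x + amount * h for x, h in zip(samples, harmonics)]
```

Setting `amount` to zero leaves the signal unchanged, which corresponds to the branch in which the difference does not exceed the predetermined level and no harmonic content is added.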

At 314, the audio signals are processed to determine whether a transient has occurred, such as by generating a derivative of an RMS value of the frequency component or in other suitable manners. A downward expander or other suitable components or processes can be used to ensure that the control signal is only generated for significant transients that will cause an associated psychoacoustic masking of predetermined associated audio frequency components. The algorithm then proceeds to 316, where a control signal is generated based on the transient. The control signal is applied to the audio signal to perform dynamic equalization at 318, such as to reduce the gain of the audio signal frequency components when a masking transient occurs, to both reduce power consumption and listener fatigue, and to improve audio quality to the listener.

In operation, algorithm 300 allows audio data to be dynamically equalized, by detecting masking transients and by using such masking transients to generate dynamic equalization controls. By dynamically equalizing the audio data that would otherwise not be perceptible to the listener, the amount of energy required to process the audio data can be reduced, and the perceived quality of the audio data can be improved.

FIG. 4 is a diagram of a system 400 for parametric stereo processing, in accordance with an exemplary embodiment of the present disclosure. System 400 can be implemented in hardware or a suitable combination of hardware and software.

System 400 includes time to frequency conversion system 402, which converts frames of a time-varying audio signal into frames of frequency components, such as by performing a fast-Fourier transform or in other suitable manners.
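The frame-wise time to frequency conversion described above can be sketched as follows. For self-containment this sketch uses a direct discrete Fourier transform rather than a fast Fourier transform; both produce the same frequency bins, and a practical system would use an FFT library. The non-overlapping, unwindowed framing and the names `dft` and `frames_to_spectra` are assumptions.

```python
import cmath

def dft(frame):
    """Direct discrete Fourier transform of one frame (O(n^2), for illustration only)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def frames_to_spectra(samples, frame_size):
    """Convert consecutive non-overlapping frames into frames of complex frequency bins."""
    return [
        dft(samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
```

Each complex bin carries a magnitude and a phase, which are the quantities that the bin comparison and phase adjustment stages operate on.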

Bin comparison system 404 receives the frames of frequency domain data and compares the magnitude of the left channel audio data with the magnitude of the right channel audio data for each frequency bin.

Phase adjustment system 406 receives the comparison data for each bin of frequency data from bin comparison system 404 and sets the phase of the right channel frequency bin component equal to the phase of the left channel frequency bin component if the magnitude of the left channel frequency bin component is greater than the magnitude of the right channel frequency bin component. The output of phase adjustment system 406 is parametric audio data.
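The per-bin comparison and phase adjustment described above can be sketched as follows, with each frequency bin represented as a Python complex number. The representation and the name `adjust_right_phase` are assumptions; the rule itself (copy the left phase to the right bin when the left magnitude is greater) follows the text.

```python
import cmath

def adjust_right_phase(left_bins, right_bins):
    """Return right channel bins with phase copied from the left where |left| > |right|.

    The right channel magnitude is always preserved; only its phase can change.
    """
    adjusted = []
    for left, right in zip(left_bins, right_bins):
        if abs(left) > abs(right):
            adjusted.append(cmath.rect(abs(right), cmath.phase(left)))
        else:
            adjusted.append(right)
    return adjusted
```

For example, with a left bin of 2+0j and a right bin of 0+1j, the left magnitude is greater, so the right bin becomes 1+0j: its magnitude of 1 is kept and its phase is set to the left phase of zero.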

Surround processing system 408 receives the parametric audio data and generates surround audio data. In one exemplary embodiment, surround processing system 408 can receive speaker location data and can calculate a phase angle difference for the input audio data that corresponds to the location of the speaker. In this exemplary embodiment, surround processing system 408 can generate audio data for any suitable number of speakers in any suitable locations, by adjusting the phase angle of the parametric audio data to reflect the speaker location relative to other speakers.

In operation, system 400 allows audio data to be processed to generate parametric audio data, which can then be processed based on predetermined speaker locations to generate N-dimensional audio data. System 400 eliminates the phase data of the input audio data, which is not needed when the input audio data is processed to be output from speakers in non-stereophonic speaker locations.

FIG. 5 is a diagram of an algorithm 500 for parametric stereo processing, in accordance with an exemplary embodiment of the present disclosure. Algorithm 500 can be implemented in hardware or a suitable combination of hardware and software. Although algorithm 500 is shown as a flow chart, other suitable programming paradigms such as state diagrams and object-oriented programming can also or alternatively be used.

Algorithm 500 begins at 502 where audio data is received, such as analog or digital audio data in the time domain. The algorithm then proceeds to 504.

At 504, the audio data is converted from the time domain to the frequency domain, such as by performing a fast Fourier transform on the audio data or in other suitable manners. The algorithm then proceeds to 506.

At 506, it is determined whether the magnitude of a left channel frequency component of the frequency domain audio data is greater than the magnitude of the associated right channel frequency component. In one exemplary embodiment, 506 can be performed on a frequency component basis for each of the frequency components of the audio data, or in other suitable manners. If it is determined that the magnitude of a left channel frequency component of the frequency domain audio data is not greater than the magnitude of the associated right channel frequency component, the algorithm proceeds to 510, otherwise the algorithm proceeds to 508, where the phase of the right channel frequency component is replaced with the phase of the left channel frequency component. The algorithm then proceeds to 510.

At 510, the audio data is processed for an N-channel surround playback environment. In one exemplary embodiment, the locations of each of a plurality of speakers can be input into a system which can then determine a preferred phase relationship of the left and right channel audio data for that speaker. The phase and magnitude of the audio data can then be generated as a function of the speaker location, or other suitable processes can also or alternatively be used.

It should be emphasized that the above-described embodiments are merely examples of possible implementations. Many variations and modifications may be made to the above-described embodiments without departing from the principles of the present disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.

1-19. (canceled)
20. A system for processing audio data comprising: a clipping system configured to receive audio input data and to generate a clipped audio input signal; a high pass filter coupled to the clipping system and configured to receive the clipped audio input signal and to generate high-pass filtered clipped audio data; a scaler multiplier coupled to the high pass filter and configured to multiply the high-pass filtered clipped audio data by a predetermined value; and a table coupled to the scaler multiplier and configured to receive a difference signal and to select the predetermined value as a function of the difference signal for use in generating an audio output signal.
21. The system of claim 20 further comprising one or more gain adjustment devices, each gain adjustment device having an associated audio input frequency band, wherein a gain adjustment device control signal is configured to decrease a gain setting of an associated gain adjustment device for a predetermined period of time as a function of a transient in the associated audio input frequency band.
22. The system of claim 20 further comprising a plurality of control signal processing systems, each control signal processing system configured to receive audio input data for an associated audio input frequency band and to generate a gain adjustment device control signal for use in generating the audio output signal.
23. The system of claim 20 further comprising a crossover filter coupled to a subtractor, the crossover filter configured to generate a low frequency output and a high frequency output and the subtractor configured to receive the low frequency output and the high frequency output and to generate the difference signal.
24. The system of claim 23 further comprising an RMS converter coupled between the crossover filter and the subtractor.
25. The system of claim 23 further comprising a linear to log converter coupled between the crossover filter and the subtractor.
26. The system of claim 20 further comprising an adder coupled to the table and one of the gain adjustment devices.
27. A method for processing audio data comprising: generating a clipped audio input signal from audio input data with a clipping system; generating high-pass filtered clipped audio data from the clipped audio input signal using a high pass filter coupled to the clipping system; multiplying the high-pass filtered clipped audio data by a predetermined value using a scaler multiplier coupled to the high pass filter; and receiving a difference signal at a table coupled to the scaler multiplier and selecting the predetermined value as a function of the difference signal to generate an output audio signal.
28. The method of claim 27 further comprising: generating a low frequency output and a high frequency output with a crossover filter coupled to the subtractor; and receiving the low frequency output and the high frequency output at the subtractor and generating the difference signal.
29. The method of claim 28 further comprising coupling an RMS converter between the crossover filter and the subtractor.
30. The method of claim 28 further comprising coupling a linear to log converter between the crossover filter and the subtractor.
31. The method of claim 27 further comprising coupling an adder to the table and one of the gain adjustment devices.
32. A non-transitory computer-readable medium encoded with computer-executable instructions that when executed by one or more processors cause the processor to: generate a clipped audio input signal from audio input data with a clipping system; generate high-pass filtered clipped audio data from the clipped audio input signal using a high pass filter coupled to the clipping system; multiply the high-pass filtered clipped audio data by a predetermined value using a scaler multiplier coupled to the high pass filter; and receive a difference signal at a table coupled to the scaler multiplier and select the predetermined value as a function of the difference signal to generate an output audio signal.
32. The non-transitory computer-readable medium of claim 31 wherein the instructions cause the processor to: generate a low frequency output and a high frequency output with a crossover filter coupled to a subtractor; and receive the low frequency output and the high frequency output at the subtractor and generate the difference signal.
33. The non-transitory computer-readable medium of claim 31 wherein the instructions cause the processor to couple an RMS converter between a crossover filter and a subtractor.
34. The non-transitory computer-readable medium of claim 31 wherein the instructions cause the processor to couple a linear to log converter between a crossover filter and a subtractor.
35. The non-transitory computer-readable medium of claim 31 wherein the instructions cause the processor to couple an adder to the table and a gain adjustment device.